Method for predicting and diagnosing trouble in a connection having a plurality of components

ABSTRACT

A computer-executable method for diagnosing trouble causes in telephone cable segments when trouble data are available only for entire cables is presented. This method diagnoses problems in the components of processes when the component location of process problems is not known. It gives a statistical relationship between component attributes and process problem rates, to facilitate assessment and improvement of policies about the process. This method may be used for diagnosing systematic problems in segments of telephone cable connections of given central offices.

This application is a continuation of Ser. No. 09/317,288 filed May 24, 1999, now U.S. Pat. No. 6,393,102.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This application relates to use of a computer system for trouble diagnosis and performance forecasting, and more particularly to techniques executed in a computer system for performance fitting, diagnosing, and forecasting of telephone connections based on curve fitting of data points involving telephone cable characteristics.

2. Description of Related Art

Numerous types of procedures involve many steps which are associated with particular components of the procedures. Each component in turn may have particular characteristics which, along with other factors, dictate the performance of each component. Having information related to component performances is advantageous for making diagnoses of trouble, minimizing costs, making repairs, and suggesting new, more efficient alternatives. Sometimes, however, knowing the performance of each component is neither easy nor even desirable, even if all the performances could be known, because this amount of information would be too unwieldy. Instead, the performances of the components of a realization may be mapped to an overall performance of the realization. The overall performance gives an indication of the global success of the procedure.

A telephone cable connection, like other multi-component processes, is an example where performance or trouble incidence is generally recorded for the entire connection, rather than the individual components. Common analysis of trouble involves an aggregate categorization of trouble type for each connection. However, because, in general, each connection may have a different configuration, these common approaches to trouble analysis preclude a diagnosis of its components. Thus, an intuitive and anecdotal examination of these aggregate-trouble frequencies is sometimes used to subjectively assess component repair and replacement policies.

More objective approaches, such as the classical technique of using logistic regression to model cable trouble rates, do exist. However, although superficially relating holistic cable trouble and aggregate segment characteristics, these approaches do not account for the probabilistic structure relating cable trouble to component trouble, and do not allow an appropriate mathematical form for component trouble.

These traditional aggregate techniques do not objectively inform managers of the effects of their policies and tactics for individual components. The characteristics of the components cannot be related to trouble types and rates. The logistic regression solution does not appropriately exploit the relationship in trouble probabilities among the cable components. Although these techniques might be augmented by using more sophisticated nonlinear regression analysis, the high data storage and computational power requirements may be prohibitive.

SUMMARY OF THE INVENTION

According to the present invention, statistical relationships between aggregate troubles and process (e.g., cable) components are constructed in a manner that facilitates the fitting of the overall performance to field data with a relatively small number of free parameters. Such fitting can aid in diagnosing and forecasting performance problems, which in turn can inform policy decisions about what apparatus and technology to use. These decisions may then be evaluated by proper benchmarking of the apparatus. The present method has sufficiently low data storage and computational demands to permit its implementation on virtually any microcomputer.

These desirable attributes of the present method are achieved by a judicious choice of a trial function, which relates component characteristics and component performances, to be used for data fitting. A good choice for such a function would lead to a number of useful functions of the characteristics instead of all the characteristics themselves. Because the number of such functions of the characteristics is considerably less than the total number of characteristics across a connection, a reduction in the data required to perform the curve fitting may be possible.

Specifically, a method executed in a computer system for diagnosing and predicting trouble causes in a process involving realizations of performance is presented. The ith realization has n_(i) steps and the performance is a function of a c-component vector of characteristics, the performance and vector having a value of θ_(i)[j, x_(i)(j)] and x_(i)(j), respectively, at the jth step of the ith realization. (The component of a vector should not be confused with the component of a connection; the former refers to each of the numbers in an ordered set defining a column or row vector, the latter is associated with each of the n_(i) steps of connection i.) The method comprises a) expressing an overall performance, ψ_(i), as a function, φ_(i), of the performance at each of the n_(i) steps, ψ_(i)=φ_(i){θ_(i)[1, x_(i)(1)], θ_(i)[2, x_(i)(2)], . . . , θ_(i)[n_(i), x_(i)(n_(i))]}; b) choosing a function, f, of a d-component vector of parameters, β, and the c-component vector of characteristics with which to approximate the performances, θ_(i)[j, x_(i)(j)]≈f[β, x_(i)(j)]. The function f is chosen such that the overall performance, ψ_(i), can be written as a function, γ_(i), of the d-component vector of parameters and r c-component vectors, s_(i,1), s_(i,2), . . . , and s_(i,r), that depend on the n_(i) characteristic vectors x_(i)(1), x_(i)(2), . . . , and x_(i)(n_(i)), $\begin{matrix} \psi_{i} = \varphi_{i}\left\{ f\left\lbrack \beta, x_{i}(1) \right\rbrack, f\left\lbrack \beta, x_{i}(2) \right\rbrack, \ldots, f\left\lbrack \beta, x_{i}\left( n_{i} \right) \right\rbrack \right\} \\ \equiv \gamma_{i}\left\{ \beta, s_{i,1}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack, s_{i,2}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack, \ldots, s_{i,r}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack \right\} \end{matrix}$

where r<n_(i); c) finding a best vector of parameters that results in a best fit of the function γ_(i)(β, s_(i,1), s_(i,2), . . . , s_(i,r)) to data points corresponding to the overall performance as a function of the r c-component vectors, s_(i,1), s_(i,2), . . . , and s_(i,r).

In a specific embodiment, the method further comprises predicting the overall performance of a realization from r c-component vectors that correspond to s_(i,1), s_(i,2), . . . , and s_(i,r) by utilizing the best vector of parameters.

In another embodiment, the step of choosing a function, f, of a d-component vector of parameters, β, and the c-component vector of characteristics, includes choosing a function, f, of a c-component vector of parameters, β, and the c-component vector of characteristics.

In a specific embodiment, the method includes expressing the overall performance, ψ_(i), as ψ_(i)=θ_(i)[1, x_(i)(1)]θ_(i)[2, x_(i)(2)]. . . θ_(i)[n_(i), x_(i)(n_(i))], and choosing the function f=exp[β·x_(i)(j)], where β·x_(i)(j) denotes an inner product of β and x_(i)(j).

In an embodiment of the invention, fitting the function γ_(i)(β, s_(i,1), s_(i,2), . . . , s_(i,r)) proceeds by utilizing a microprocessor, and a regression algorithm, which may employ the Levenberg-Marquardt method, to estimate the vector of parameters, β, that approximately maximizes agreement between the function γ_(i)(β, s_(i,1), s_(i,2), . . . , s_(i,r)) and the data points. The Levenberg-Marquardt method may utilize a figure-of-merit function that measures this agreement.

In another embodiment of the present invention, a method executed in a computer system for predicting trouble in a telephone connection is presented, comprising a) characterizing a jth component of a sequence of n_(i) components of an ith telephone connection by a c-component vector of characteristics x_(i)(j); b) assigning a performance θ_(i)[j, x_(i)(j)] to the jth component of the ith telephone connection; c) approximating the performance as θ_(i)[j, x_(i)(j)]=exp[β·x_(i)(j)], where β is a c-component vector of parameters and β·x_(i)(j) denotes an inner product of β and x_(i)(j); d) defining an overall performance of the ith telephone connection as the product of the performances over the components of the ith realization, $\psi_{i} = \prod\limits_{j = 1}^{n_{i}} \theta_{i}\left\lbrack j, x_{i}(j) \right\rbrack;$

e) varying the vector of parameters β to find a best vector of parameters that fits a curve $\begin{matrix} \psi_{i} = \exp\left\lbrack \beta \cdot \sum\limits_{j = 1}^{n_{i}} x_{i}(j) \right\rbrack \\ = \exp\left( \beta \cdot s_{i} \right), \end{matrix}$

where s_(i)=x_(i)(1)+x_(i)(2)+ . . . +x_(i)(n_(i)), through data points corresponding to the overall performance versus s_(i).

In another embodiment, this last method further comprises predicting the overall performance of a realization from a c-component vector, that corresponds to s_(i), by utilizing the best vector of parameters.

Also presented is a system for diagnosing and predicting trouble causes in a process involving realizations of performance. The ith realization has n_(i) steps and the performance is a function of a c-component vector of characteristics. The performance and c-component vector have a value of θ_(i)[j, x_(i)(j)] and x_(i)(j), respectively, at the jth step of the ith realization. The system comprises a) a computer; b) instructions for the computer to express an overall performance, ψ_(i), as a function, φ_(i), of the performance at each of the n_(i) steps, ψ_(i)=φ_(i){θ_(i)[1, x_(i)(1)], θ_(i)[2, x_(i)(2)], . . . , θ_(i)[n_(i), x_(i)(n_(i))]}; c) instructions for the computer to choose a function, f, of a d-component vector of parameters, β, and the c-component vector of characteristics with which to approximate the performances, θ_(i)[j, x_(i)(j)]≈f[β, x_(i)(j)], such that the overall performance, ψ_(i), can be written as a function, γ_(i), of the d-component vector of parameters and r c-component vectors, s_(i,1), s_(i,2), . . . , and s_(i,r), that depend on the n_(i) characteristic vectors x_(i)(1), x_(i)(2), . . . , and x_(i)(n_(i)), $\begin{matrix} \psi_{i} = \varphi_{i}\left\{ f\left\lbrack \beta, x_{i}(1) \right\rbrack, f\left\lbrack \beta, x_{i}(2) \right\rbrack, \ldots, f\left\lbrack \beta, x_{i}\left( n_{i} \right) \right\rbrack \right\} \\ \equiv \gamma_{i}\left\{ \beta, s_{i,1}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack, s_{i,2}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack, \ldots, s_{i,r}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack \right\} \end{matrix}$

where r<n_(i); and d) instructions for the computer to find a best vector of parameters that results in a best fit of the function γ_(i)(β, s_(i,1), s_(i,2), . . . , s_(i,r)) to data points corresponding to the overall performance as a function of the r c-component vectors, s_(i,1), s_(i,2), . . . , and s_(i,r).

In another embodiment, in the previous system for diagnosing, and predicting trouble causes, an overall performance of a process with given characteristics is forecast by using the function γ_(i)(β, s_(i,1), s_(i,2), . . . , s_(i,r)) with the best vector of parameters.

In yet another embodiment, a system is presented for predicting trouble in a telephone connection. A jth component of a sequence of n_(i) components of an ith telephone connection is characterized by a c-component vector of characteristics x_(i)(j). The system comprises a) a computer; b) instructions to the computer to assign a performance θ_(i)[j, x_(i)(j)], that depends on the vector of characteristics, to the jth component of the ith telephone connection; c) instructions to the computer to (i) approximate the performance as θ_(i)[j, x_(i)(j)]=exp[β·x_(i)(j)], where β is a c-component vector of parameters and β·x_(i)(j) denotes an inner product of β and x_(i)(j), and (ii) define an overall performance of the ith telephone connection as the product of the performances over the components of the ith realization, $\psi_{i} = \prod\limits_{j = 1}^{n_{i}} \theta_{i}\left\lbrack j, x_{i}(j) \right\rbrack;$ and

e) means for varying the vector of parameters β to find a best vector of parameters to fit a curve $\begin{matrix} \psi_{i} = \exp\left\lbrack \beta \cdot \sum\limits_{j = 1}^{n_{i}} x_{i}(j) \right\rbrack \\ = \exp\left( \beta \cdot s_{i} \right), \end{matrix}$

where s_(i)=x_(i)(1)+x_(i)(2)+ . . . +x_(i)(n_(i)), to data points corresponding to the overall performance versus s_(i).

In another embodiment, a system to diagnose trouble causes in telephone cable segments when trouble data are available only for entire cables is presented. The system includes a computer; means for approximating an overall performance of a telephone connection depending on n c-component vectors of characteristics so that the overall performance can be modeled by using a function with adjustable c-component vectors of parameters, the vectors of parameters being less in number than n; and means for varying the adjustable c-component vectors of parameters to find best-fit vectors of parameters to fit a curve to data points. The means for varying the adjustable c-component vectors of parameters can include using a regression algorithm. The diagnosis proceeds by examining and interpreting the vector of parameters β.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1a is a perspective view of a computer system for implementing the present invention.

FIG. 1b is an example of a computer network used to process code pertaining to the present invention.

FIG. 2 is an example of an embodiment showing three components of an i* realization, and the mapping of performances to the overall performance.

FIG. 3 is an example of some characteristics of cable components in one embodiment of the present invention pertaining to telephone connections. The five characteristics shown are cable age, type, length, size, and gauge.

FIG. 4 is a flowchart of method steps of an embodiment of how fitting, and predicting of the overall performance proceeds using performance/characteristics data.

FIG. 5 is an example of computer prompts for data, sample input data, and the computer output.

FIG. 6 is a representative input data file for the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Procedures that satisfy particular business needs may include a sequence of steps. Each step is associated with a particular component of the procedure. For example, a service order may include a sequence of departmental interventions in a company, and a land line telephone connection may include a sequence of cables that are electrically connected to establish telephone service to a home. With each component of the procedure is associated a performance. The performance, however, may depend on a countless number of factors, some of which are unknown. It is no surprise, then, that the jth step of any two invocations or realizations of the procedure may have different performances even if the two realizations involve the same components. A telephone call to a person, for example, may be accompanied by electrical noise even if moments earlier a telephone call to the same person over the same telephone lines was successful. A fortiori, if different components are involved in the two realizations—in the previous example, this would be the case if two persons at different households were called—then the performance of the jth step may generally be different.

These types of procedures may be treated as a stochastic process. In particular, the performance, denoted by θ(j), could be considered to be a random variable that is a function of the step j. The total number of steps n_(i) of the ith realization can, in principle, be arbitrarily large, and be different from other realizations. In the example of a telephone connection, two realizations of a call from location A to location B may be obtained either by making two telephone calls using the same connection or by making two telephone calls using different connections. In the latter case, the number of steps or components to complete the two calls may be different. Moreover, two realizations may start out at A, but end up at different locations, B and C. Again, these two realizations may have a different number of total steps. The performance, if directly measurable at all, may be determined probabilistically. That is, it may be assumed that there is a probability density, p, with the property that p[θ(1), θ(2), . . . , θ(n_(i))]dθ(1)dθ(2) . . . dθ(n_(i)) is the probability that the performance takes on a value between θ(1) and θ(1)+dθ(1) at step 1, θ(2) and θ(2)+dθ(2) at step 2, . . . , and θ(n_(i)) and θ(n_(i))+dθ(n_(i)) at step n_(i). Sometimes the performances are a function of a non-random, c-component vector of characteristics, x_(i)(j), that is a function of the step j. The functional form of these characteristics is also assumed to depend on the realization, hence the subscript i. In the example of telephone connections, the relevant characteristics might include the age of the telephone cables involved in the connection, as well as the type, length, size, and gauge of the cables.

Sometimes, it is not the actual realizations that are of concern, but rather some function of the realizations. An example of such a function is the overall performance: designating the ith realization of the performance by θ_(i)[j, x_(i)(j)], the overall performance of the ith realization may be denoted by

ψ_(i)=φ_(i){θ_(i)[1, x_(i)(1)], θ_(i)[2, x_(i)(2)], . . . , θ_(i)[n_(i), x_(i)(n_(i))]}  (1)

for some appropriate function φ_(i). The overall performance maps the sequence {θ_(i)[1, x_(i)(1)], θ_(i)[2, x_(i)(2)], . . . , θ_(i)[n_(i), x_(i)(n_(i))]} for a particular realization i to a number ψ_(i), which attempts to capture, in a global sense, how well the various components are functioning. A simple example of an overall performance would be the average of the performances over the number of steps, n_(i), in a realization.
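As a rough illustration of Eq. (1), the mapping from per-step performances to an overall performance can be sketched in a few lines of Python. The function name `overall_performance` and the numerical values are hypothetical and chosen only for this example; the average is the simple case mentioned above, and the product is shown as a second possible choice of φ_(i).

```python
from math import prod
from statistics import mean

def overall_performance(performances, phi=mean):
    """Map the per-step performances of one realization to a single number."""
    return phi(performances)

# Hypothetical performances theta_i[1], theta_i[2], theta_i[3] of one realization.
theta = [0.99, 0.95, 0.98]

psi_average = overall_performance(theta)        # phi = average over the n_i steps
psi_product = overall_performance(theta, prod)  # phi = product over the n_i steps
```

The choice of φ_(i) is part of the modeling; nothing in this sketch prescribes one form over another.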

An actual realization may be a complicated function of the step j. Sometimes it might be useful to try to fit a curve through each of the realizations. This, however, is sometimes difficult because too many parameters would be needed for the curve fit, and, moreover, the performances, θ_(i), may not even be known. Instead of attempting to fit the performances, θ_(i), it may be more profitable to first approximate the θ_(i) using a trial function, f, having some free parameters. The type of trial function chosen is one that reduces the number of these free parameters needed to perform the curve fit. The overall performance may then be obtained using this approximation for the performances. Finally, this expression for the overall performance may be fitted to performance-characteristic data by varying the free parameters.

A trial function f is chosen that reduces the number of free parameters needed for the curve fit. The fit can proceed by choosing a function f of a d-component vector of parameters β, and the c-component vector of characteristics, with which to approximate the performances,

θ_(i) [j, x _(i)(j)]≈f[β, x _(i)(j)].  (2)

The trial function may be chosen so that the overall performance, ψ_(i), can be written as a function, γ_(i), of the d-component vector of parameters, β, and r c-component vectors, s_(i,1), s_(i,2), . . . , and s_(i,r), that depend on the n_(i) characteristic vectors x_(i)(1), x_(i)(2), . . . , and x_(i)(n_(i)). In mathematical symbols, $\begin{matrix} \psi_{i} = \varphi_{i}\left\{ f\left\lbrack \beta, x_{i}(1) \right\rbrack, f\left\lbrack \beta, x_{i}(2) \right\rbrack, \ldots, f\left\lbrack \beta, x_{i}\left( n_{i} \right) \right\rbrack \right\} & (3) \\ \equiv \gamma_{i}\left\{ \beta, s_{i,1}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack, s_{i,2}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack, \ldots, s_{i,r}\left\lbrack x_{i}(1), x_{i}(2), \ldots, x_{i}\left( n_{i} \right) \right\rbrack \right\} & (4) \end{matrix}$

where r<n_(i). From this last equation, it follows that ψ_(i) may be expressed as a function of r independent vectors s_(i,1), s_(i,2), . . . , and s_(i,r) instead of the n_(i) vectors x_(i)(1), x_(i)(2), . . . , and x_(i)(n_(i)). Since r<n_(i), one would expect that the number of parameters needed to fit the overall performance would be reduced.

Given data points of overall performance versus the vectors s_(i,1), s_(i,2), . . . , and s_(i,r), it is possible to fit the curve γ_(i) through these points by varying the parameters β. A regression algorithm, as may be found in commercial software packages, can find the vector of parameters that provides a best fit. With this best vector of parameters in hand, it is possible to diagnose and predict the overall performance given r c-component vectors that correspond to s_(i,1), s_(i,2), . . . , and s_(i,r).

In the application discussed below, for example, the overall performance, ψ_(i), is the probability of correct functioning of a telephone cable connection i composed of many cable components connected in series. Then if θ_(i)[j, x_(i)(j)] is the probability of the correct functioning of component j of connection i, it follows from the series connection of the components that $\begin{matrix} \psi_{i} = \prod\limits_{j = 1}^{n_{i}} \theta_{i}\left\lbrack j, x_{i}(j) \right\rbrack & (5) \end{matrix}$

To ultimately estimate the overall performances, it is then convenient to choose $\begin{matrix} \theta_{i}\left\lbrack j, x_{i}(j) \right\rbrack \approx f\left\lbrack \beta, x_{i}(j) \right\rbrack & (6) \\ = \exp\left\lbrack \beta \cdot x_{i}(j) \right\rbrack, & (7) \end{matrix}$

where β·x_(i)(j) denotes the inner product of β and x_(i)(j), and it is assumed that the vectors β and x_(i)(j) belong to vector spaces of the same dimension, i.e., c=d. Note, however, that in general d may be larger than c if interactions are included. In addition, the choice of the function f implies that the j dependence of the performance enters only through x_(i)(j), although in general the performance can also depend explicitly on the step j.

To estimate the overall performance, ψ_(i), Eq. (7) may be substituted into Eq. (5) to yield $\begin{matrix} \psi_{i} = \prod\limits_{j = 1}^{n_{i}} \exp\left\lbrack \beta \cdot x_{i}(j) \right\rbrack & (8) \\ = \exp\left\lbrack \beta \cdot \sum\limits_{j = 1}^{n_{i}} x_{i}(j) \right\rbrack & (9) \\ = \exp\left( \beta \cdot s_{i} \right), & (10) \end{matrix}$

where the c-component vector s_(i) is defined by $\begin{matrix} s_{i} = \sum\limits_{j = 1}^{n_{i}} x_{i}(j). & (11) \end{matrix}$
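The step from Eq. (8) to Eq. (10) can be checked numerically. The following is a minimal Python sketch; the parameter vector `beta` and the characteristic vectors `x` (here read as [age, length]) are hypothetical values used only to show that the product of the component performances equals the exponential of β·s_(i).

```python
from math import exp, prod

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical parameter vector and per-component characteristic vectors.
beta = [-0.002, -0.01]
x = [[12.0, 1.0], [30.0, 0.5], [5.0, 2.0]]

# Eq. (11): component-wise sum of the characteristic vectors.
s = [sum(column) for column in zip(*x)]

# Eq. (8): product of the per-component performances exp(beta . x_i(j)).
psi_from_components = prod(exp(dot(beta, xj)) for xj in x)

# Eq. (10): the same quantity computed from the summary vector alone.
psi_from_summary = exp(dot(beta, s))
```

The two values agree up to floating-point rounding, so only `s` needs to be stored for each connection.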

Using Eq. (10), it may be concluded that it is not necessary to know the individual characteristics at each step. Instead, it is the sum of the characteristics, s_(i), that enters the computation. (In this example, r of Equation (4) is thus equal to one.) This is a valuable feature of the present formulation, which also permits a reduction in the number of parameters needed to fit the overall performance. Being able to straightforwardly choose the parameters β to achieve a good curve fit, with a standard statistical regression software package, is another advantage of this formulation of the process diagnosis problem. Still another is that a data set of summarized characteristics can often be used as input to the statistical routines, which may be an important feature in situations where the number of components n_(i) is large.
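To make the fitting step concrete, the following Python sketch fits ψ_(i)=exp(β·s_(i)) to data by a damped Gauss-Newton iteration in the style of the Levenberg-Marquardt method. It is an illustration under stated assumptions, not the regression package referred to in the text: the helper names (`solve`, `chi2`, `fit_lm`), the damping schedule, the starting point, and the noise-free synthetic data are all hypothetical.

```python
from math import exp

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def chi2(beta, S, psi):
    """Figure of merit: sum of squared differences between data and model."""
    return sum((p - exp(dot(beta, s))) ** 2 for s, p in zip(S, psi))

def fit_lm(S, psi, beta0, lam=1e-3, iters=200, tol=1e-12):
    beta, d = list(beta0), len(beta0)
    for _ in range(iters):
        JTJ = [[0.0] * d for _ in range(d)]
        JTr = [0.0] * d
        for s, p in zip(S, psi):
            m = exp(dot(beta, s))          # model value; its gradient is m * s
            for a in range(d):
                JTr[a] += m * s[a] * (p - m)
                for b in range(d):
                    JTJ[a][b] += m * s[a] * m * s[b]
        # Damp the diagonal, then solve for the trial step.
        A = [[JTJ[a][b] * (1.0 + lam) if a == b else JTJ[a][b] for b in range(d)]
             for a in range(d)]
        step = solve(A, JTr)
        if max(abs(v) for v in step) < tol:
            break
        trial = [bi + di for bi, di in zip(beta, step)]
        if chi2(trial, S, psi) < chi2(beta, S, psi):
            beta, lam = trial, lam / 10.0   # accept the step, relax the damping
        else:
            lam *= 10.0                     # reject the step, increase the damping
    return beta

# Hypothetical summary vectors s_i = [sum of component ages, number of components]
# and noise-free synthetic overall performances generated from a known beta.
TRUE_BETA = [-0.001, -0.005]
S = [[50.0, 3.0], [120.0, 5.0], [200.0, 8.0], [80.0, 10.0], [300.0, 6.0]]
psi_obs = [exp(dot(TRUE_BETA, s)) for s in S]

beta_hat = fit_lm(S, psi_obs, beta0=[0.0, 0.0])
psi_pred = exp(dot(beta_hat, [100.0, 4.0]))  # forecast for a new summary vector
```

With real field data the observations would be noisy, and a tested library routine would normally be preferable to this hand-written solver.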

The reliability of a telephone connection from a residence to a local telephone company's central office may perhaps be the central quality determinant for a public switched telephone network. Each of these connections consists of a varying number of components (cables) which can be one of a handful of types (e.g., air-core or gelatin-filled), three basic placement options (aerial, buried or underground conduit), and a wide variety of sizes and lengths. For any given residence, company databases may indicate the characteristics of the cable components which constitute that connection. Generally, though, connection troubles are recorded and often defined only for connections as a whole, and not for a particular component.

For cable trouble diagnosis, Eq. (10) permits considerable data and parameter reduction. Instead of requiring each of 5-10 characteristics for each of 20-100 components per cable, for example, our model requires only the total number of cable components of a particular type across the entire cable, and (for diagnosing the effect of component age) the sum of the ages of components across the entire cable. Also, instead of requiring a set of parameters for each cable component, it is sufficient to estimate parameters only for the set of cable types and for age within type. (The exact number of parameters depends, as in most statistical regression model fitting, on the exact form of the postulated model. The number of parameters will, generally, be much smaller than the number of components per line.)
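The data reduction described above can be sketched as follows in Python. The record layout (a cable type and an age per component), the type names, and the function name `summarize` are assumptions made for this example; the sketch only shows that one connection's component list collapses to a count per type and an age sum per type.

```python
from collections import defaultdict

# Hypothetical component records for one connection: (cable_type, age_in_years).
components = [
    ("aerial", 12),
    ("buried", 30),
    ("aerial", 5),
    ("underground", 22),
    ("buried", 8),
]

def summarize(records):
    """Collapse per-component records into per-type counts and per-type age sums."""
    count_by_type = defaultdict(int)
    age_sum_by_type = defaultdict(int)
    for cable_type, age in records:
        count_by_type[cable_type] += 1
        age_sum_by_type[cable_type] += age
    return dict(count_by_type), dict(age_sum_by_type)

counts, age_sums = summarize(components)
```

The two dictionaries together play the role of the summary vector s_(i) for this connection, however many components it contains.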

When company management wishes to evaluate policy regarding the engineering, installation and preemptive replacement practices for these cables, it becomes vital to be able to make general statistical statements about the effect of cable characteristics on trouble frequency, even though no direct information on component-level trouble is available. A cable characteristic that may be important is its age (i.e., the number of years since its installation), for its effect is dynamic and inevitable. Although other cable characteristics, such as gauge and size, also affect trouble rates, it is helpful to focus on the effect of cable age across the many types and placements of cable.

This can be understood as a process diagnosis problem in the following way. The process is a central office's provision of working local telephone connections over a particular time period, say one year. The realizations of that process are the individual connections to the residences associated with that central office. Note that the realizations, then, are arranged spatially rather than time-wise. Each component of a particular realization is a piece of cable, each of whose characteristics (type, placement, age, length, gauge, etc.) is known. A measure related to the overall performance is the probability of cable-related trouble, Pr(CableTrouble), on each connection in the 12-month window. For connection i, with components indexed by j, one may define the overall performance as $\begin{matrix} \psi_{i} = 1 - \Pr(\mathrm{CableTrouble}) & (12) \\ = \prod\limits_{j = 1}^{n_{i}} \theta_{i}\left\lbrack j, x_{i}(j) \right\rbrack & (13) \end{matrix}$

where θ_(i)[j, x_(i)(j)] is the probability of no cable trouble in component j of connection i. This follows as the cable segments are always arranged in a series structure. The entire connection can be trouble-free if and only if each of its components is trouble-free during the time-window in question.
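Eqs. (12) and (13) can be illustrated with a short Python sketch. The function name and the component probabilities are hypothetical; the product form treats the connection as trouble-free only when every component is trouble-free, as stated above.

```python
from math import prod

def connection_trouble_probability(theta):
    """Return Pr(CableTrouble) for a series connection.

    theta holds the per-component probabilities of no cable trouble
    during the time window (hypothetical values in this example).
    """
    psi = prod(theta)   # Eq. (13): probability that every component is trouble-free
    return 1.0 - psi    # Eq. (12) rearranged for Pr(CableTrouble)

theta = [0.999, 0.995, 0.990]
p_trouble = connection_trouble_probability(theta)
```

Because the per-component probabilities multiply, even highly reliable components can add up to a noticeable trouble probability on a long connection.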

Quality improvement for cable maintenance requires the examination of such systematic policies as choice of cable type, ongoing cable upgrade programs, training, cable installation practice and the anticipation of the effects of increasing cable age. Many such company issues can be diagnosed by relating the probability of a problem in component j of cable i to such characteristics as cable type, age, gauge and size, as well as to its maintaining central office (and its idiosyncratic practices). Thus, for the diagnosis of trouble causes in telephone cable segments

θ_(i)[j, x_(i)(j)]=exp[β_(CO)CO_(i)(j)+β_(CType)CType_(i)(j)+β_(Age)Age_(i)(j)+{OtherCharacteristics_(i)(j)}]  (14)

where CO_(i)(j) is the Central Office maintaining component j of the ith connection,

CType_(i)(j) is the Cable Type of component j of the ith connection,

Age_(i)(j) is the age of component j of the ith connection, and

{OtherCharacteristics_(i)(j)} include length, size and gauge of component j, all for the ith connection.
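One way to read Eq. (14) in code is sketched below in Python. The text does not say how the categorical characteristics (central office, cable type) are encoded, so this example assumes one coefficient per category level; the level names, the coefficient values, and the use of length as the only other characteristic are all hypothetical.

```python
from math import exp

# Hypothetical coefficients; one value per category level is an assumption.
beta = {
    ("CO", "office_A"): -0.0010,
    ("CO", "office_B"): -0.0020,
    ("CType", "air_core"): -0.0030,
    ("CType", "gel_filled"): -0.0005,
    "Age": -0.0002,      # per year of component age
    "Length": -0.0001,   # per unit of length, standing in for the other characteristics
}

def component_performance(component):
    """Probability of no trouble in one component, following the form of Eq. (14)."""
    linear = (
        beta[("CO", component["co"])]
        + beta[("CType", component["ctype"])]
        + beta["Age"] * component["age"]
        + beta["Length"] * component["length"]
    )
    return exp(linear)

example = {"co": "office_A", "ctype": "air_core", "age": 20, "length": 3}
theta_example = component_performance(example)
```

With non-positive coefficients and non-negative characteristics the result stays between 0 and 1, which is what a probability requires; the sketch does not enforce that constraint.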

This formulation uses a statistical decomposition of overall cable trouble rates to specify characteristics which are trouble causes. It exploits the great size and variety of cables in company databases to differentiate among alternative trouble causes. The resulting statistical (i.e., not location-specific) identifiers of trouble root causes are well suited to managerial decision-making.

The above description will now be illustrated by reference to the accompanying drawings. Referring to FIG. 1a, a computer system 10 may be used to implement the present invention. In a preferred embodiment, the computer system 10 may include a Gateway 2000 computer, model P5-90, with a 90 MHz clock speed, 32 MB of RAM, and a 100 MB hard drive. The system 10 may be used to condense and summarize performance data of multi-component procedures, such as telephone connections, by fitting the data to a model that depends on adjustable parameters. Although the computer system 10 is shown for the purpose of illustrating a preferred embodiment, the present invention is not limited to the particular computer system 10 shown, but may be used on any electronic processing system having sufficient performance and characteristics (such as memory) to provide the functionality described herein.

The computer system 10 may include a microprocessor-based unit 11 for receiving and processing software programs and for performing other processing functions. A keyboard 12 may be connected to the microprocessor-based unit 11 for permitting a user to input information to the software. As an alternative to using the keyboard 12 for input, a mouse 13 may be used for moving a selector 14 on a display 15 and for selecting an item on which the selector 14 overlays, as is well known in the art. A floppy disk 16 may also include a software program, and may be inserted into the microprocessor-based unit 11 for inputting the software program. Still further, the microprocessor-based unit 11 may be programmed, as is well known in the art, for storing the software program internally. A printer 17 may be connected to the microprocessor-based unit 11 for printing a hard copy of the output of the computer system 10.

The software used to implement the present invention, which can be written in a high-level language like C, Fortran, or Pascal, may be stored on a hard drive (not shown) located within the microprocessor-based unit 11. The software should be capable of interfacing with any internal or external subroutines, such as regression packages used to perform curve fitting. In one preferred embodiment, the SPSS statistical package (see below) may be used to perform regression analysis. In a preferred embodiment, the memory size of the computer system 10 is large enough to accommodate regression analysis software and to hold the relevant field data. In one preferred embodiment, approximately 2-3 MB of telephone cable field data, stored on a floppy disk 16 or a hard drive, may be used. For smaller amounts of field data, the software may use the display 15 to prompt the user to input relevant field data.

In other embodiments, the microprocessor used to process computer code pertaining to the present invention can be part of a network. For example, FIG. 1b illustrates a network computer system 114. The computer system 114 is shown to include a plurality of computer processors or nodes 201-204 connected to a network 110 by network interface connections 121-124, respectively. Particular nodes, such as the nodes 201, 202, may communicate using the network 110 over network connections 121, 122, respectively. The nodes 203, 204 similarly may communicate with other nodes using the network 110 through network interface connections 123, 124, respectively.

It should be noted that the hardware of the various nodes and network that may be included may vary with application and use. A conventional computer system, as well as a specially manufactured computer system for a particular application, may be used in a preferred embodiment of the invention. Similarly, an embodiment may include any type of network 110 required for a particular application. A preferred embodiment of the invention may alternatively include no network and instead reside on a standalone computer system, as shown in FIG. 1a, with software loaded into the system via a storage medium and device, such as a CD-ROM or disk drive.

FIG. 2 depicts a particular realization i* of the stochastic process that is a sequence of performances. Three components are shown as elements 22, 24, and 26, corresponding to the first, second, and final n_(i*)th step. The performances, evaluated at each of the n_(i*) steps {θ_(i*)[1, x_(i*)(1)], θ_(i*)[2, x_(i*)(2)], . . . , θ_(i*)[n_(i*), x_(i*)(n_(i*))]}, are mapped to the overall performance ψ_(i*) 28.

In one embodiment of the present invention, the stochastic process involves the performance of telephone operation. A realization of this stochastic process is then a particular telephone connection. Each connection is composed of a sequence of connections associated with the steps involved in establishing the telephone connection. Each component in this embodiment may include a cable and other electrical hardware. The performance of each cable component is a function of the characteristics of the cable.

FIG. 3 shows some characteristics of cable components in one embodiment of the present invention pertaining to telephone connections. The particular realization shown is for the jth step 32 of the i* realization 34. Five types of characteristics are shown, although this number may vary in other embodiments. Accordingly, the vector of characteristics x_(i*)(j) has five components. (The component of a vector should not be confused with the component of a connection; the former refers to each of the numbers in an ordered set defining a column or row vector, the latter is associated with each of the n_(i) steps of connection i.) The five characteristics shown in FIG. 3 are cable age 36, type 37, length 38, size 39, and gauge 40.

FIG. 4 is a flowchart of an embodiment of the present invention indicating how fitting and predicting of the overall performance proceeds using performance/characteristics data. In the first stage 41, each of the r c-component vectors, s_(i,1), s_(i,2), . . . , and s_(i,r), that depend on the n_(i) characteristic vectors x_(i)(1), x_(i)(2), . . . , and x_(i)(n_(i)), is assigned. In the second stage 42, the overall performance, ψ_(i), is expressed in terms of a vector of parameters β and the r vectors s_(i,1), s_(i,2), . . . , and s_(i,r). In stage 43, a regression routine is used to obtain a best vector of parameters that fits the overall performance to performance/characteristic data. In the final stage 44, the best vector of parameters is used to predict the overall performance for r vectors corresponding to s_(i,1), s_(i,2), . . . , and s_(i,r).
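
The first stage can be sketched in Python as follows. This is a minimal illustration only, assuming that each component of a connection is recorded with a cable type and an age; the names `Component` and `summarize_connection`, and the type labels, are hypothetical and are not part of the described embodiment. The sketch reduces the per-component characteristics of one connection to per-type counts and per-type age sums, which are the kind of summary quantities that enter the regression.

```python
from dataclasses import dataclass


@dataclass
class Component:
    cable_type: str  # hypothetical label such as "CTyp1"
    age: float       # age of the component in years


def summarize_connection(components, cable_types):
    """Return (counts, age_sums) per cable type for one connection."""
    counts = {t: 0 for t in cable_types}
    age_sums = {t: 0.0 for t in cable_types}
    for comp in components:
        # Components of a type outside cable_types are ignored in this sketch.
        if comp.cable_type in counts:
            counts[comp.cable_type] += 1
            age_sums[comp.cable_type] += comp.age
    return counts, age_sums


# Illustrative connection with three components (values are made up).
connection = [
    Component("CTyp1", 12.0),
    Component("CTyp1", 3.0),
    Component("CTyp3", 20.0),
]
counts, age_sums = summarize_connection(connection, ["CTyp1", "CTyp2", "CTyp3"])
```

Keeping the counts and the age sums in separate mappings mirrors the way the model treats the number of components of a type and their total age as distinct regressors.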

The implementation of this regression routine will now be outlined for a particular embodiment of the present invention. Equation (10) has the following form for a specific implementation in which the cable connection under examination is composed of components of three different types, and interest centers on the effects of the age of those types:

CabTrb1=exp(bCO#*CO#+bCTyp1*n_CTyp1+bageCTyp1*ageCTyp1+bCTyp2*n_CTyp2+bageCTyp2*ageCTyp2+bCTyp3*n_CTyp3+bageCTyp3*ageCTyp3)+error

where:

CabTrb1=1(0) if the ith cable connection has (has not had) cable trouble

CO#=1(0) if ith cable connection emanates (does not emanate) from Central Office #

n_CTyp1=number of cable components of type CTyp1 in cable connection i

ageCTyp1=sum of ages of CTyp1 components in cable connection i

n_CTyp2=number of cable components of type CTyp2 in cable connection i

ageCTyp2=sum of ages of CTyp2 components in cable connection i

n_CTyp3=number of cable components of type CTyp3 in cable connection i

ageCTyp3=sum of ages of CTyp3 components in cable connection i

are characteristics of cable connection i. The nonlinear regression estimates the following free parameters:

bCO#

bCTyp1

bCTyp2

bCTyp3

bageCTyp1

bageCTyp2

bageCTyp3

which are interpreted in the usual regression sense, i.e., the incremental change in the term in parentheses on the right-hand side of the above equation for a one-unit change in each component's associated characteristic. For example, the parameter bageCTyp1 is the effect (measured as a multiplicative factor of exp(bageCTyp1)) of an increase of one year of total age in the components of cable type CTyp1 in each cable connection in the population under examination. The term “error” is a term which allows for any discrepancies between the cable trouble which is actually observed (either a 0 or 1) and the function on the right-hand side of the equation (which is continuous). As is standard in regression problems, this term is not estimated for any individual cable connection, but is simply included to balance the equation.
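
The multiplicative interpretation of the age parameters can be checked numerically. The following Python sketch is illustrative only: the function name `predict_trouble` is hypothetical, the model is cut down to three regressors for brevity, and the parameter values are arbitrary numbers chosen for the example rather than fitted estimates. It evaluates the exponential term for one cable connection and confirms that adding one year to the total age of the CTyp1 components multiplies the predicted value by exp(bageCTyp1).

```python
import math


def predict_trouble(params, row):
    """Evaluate exp(sum of b_k * x_k) for one cable connection."""
    linear = sum(params[name] * row[name] for name in params)
    return math.exp(linear)


# Arbitrary illustrative parameters; NOT estimates from field data.
params = {"CO": -2.0, "n_CTyp1": 0.1, "ageCTyp1": 0.02}
row = {"CO": 1, "n_CTyp1": 2, "ageCTyp1": 15.0}

base = predict_trouble(params, row)
older = predict_trouble(params, {**row, "ageCTyp1": 16.0})

# One extra year of total CTyp1 age scales the prediction by exp(bageCTyp1).
ratio = older / base
```

Because the regressors enter through an exponential, the ratio does not depend on the values of the other characteristics of the connection.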

FIG. 5 is an example of a computer screen 50 showing computer prompts for data, sample input data, and the computer output. The user is presented with a request 51. In this case, the user enters a 1(0) if cable connection 1 has (has not had) cable trouble 52, a 1(0) if cable connection 1 emanates (does not emanate) from Central Office # 53, the number of cable components of type CTyp1 in cable connection 1 54, the sum of ages of CTyp1 components in cable connection 1 55, the number of cable components of type CTyp2 in cable connection 1 56, the sum of ages of CTyp2 components in cable connection 1 57, the number of cable components of type CTyp3 in cable connection 1 58, and the sum of ages of CTyp3 components in cable connection 1 59. The user hits the return key to indicate to the software that data have been selected for cable connection 1. The user is then prompted to enter similar data for cable connection 2 followed by hitting the return key. After the data for all cable connections have been entered, the computer outputs the best-fit parameters corresponding to bCO# 60, bCTyp1 61, bCTyp2 62, bCTyp3 63, bageCTyp1 64, bageCTyp2 65, and bageCTyp3 66.

Alternatively, instead of inputting data at the keyboard, data can be accessed from a data file on, for example, a hard drive or CD-ROM. Using data files instead of inputting data at the keyboard is convenient if large amounts of data are to be processed. FIG. 6 is a representative input data file 500 including characteristics of twenty-eight telephone connections. The first column 501 lists the realization number. The second column 502 contains entries of a 1(0) if the respective cable connection has (has not had) cable trouble. The third column 503 contains entries of a 1(0) if the respective cable connection emanates (does not emanate) from Central Office 1. The fourth column 504 contains entries corresponding to the number of cable components of type CTyp1 in the respective connection. The fifth column 505 contains entries corresponding to the sum of ages of CTyp1 components in the respective connection. The sixth column 506 contains entries corresponding to the number of cable components of type CTyp2 in the respective connection. The seventh column 507 contains entries corresponding to the sum of the ages of CTyp2 components in the respective connection. The eighth column 508 contains entries corresponding to the number of cable components of type CTyp3 in the respective cable connection. A final column (not shown) could contain entries corresponding to the sum of the ages of CTyp3 components in the respective cable connection.
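
Reading such a file can be sketched in Python as follows. The sketch assumes a whitespace-delimited text file with one connection per row and the nine columns described above, including the optional final age column; the actual file layout of FIG. 6 is not specified beyond the column order, so the delimiter, the column names in `COLUMNS`, and the function name `read_connections` are assumptions. The two sample rows are made up for illustration and are not taken from FIG. 6.

```python
import io

# Assumed column order, following the description of the input data file.
COLUMNS = ["id", "CabTrb1", "CO", "n_CTyp1", "ageCTyp1",
           "n_CTyp2", "ageCTyp2", "n_CTyp3", "ageCTyp3"]


def read_connections(handle):
    """Parse whitespace-delimited rows into a list of dicts of floats."""
    rows = []
    for line_number, line in enumerate(handle, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(COLUMNS)} fields, "
                f"got {len(fields)}"
            )
        rows.append({name: float(value) for name, value in zip(COLUMNS, fields)})
    return rows


# Made-up sample rows standing in for a data file on disk.
sample = io.StringIO("1 0 1 2 15 0 0 1 20\n2 1 0 1 30 2 44 0 0\n")
rows = read_connections(sample)
```

Passing an open file object in place of the `io.StringIO` buffer would read the same layout from a hard drive or CD-ROM.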

The parameters for the model can be estimated by any one of several known regression algorithms. In one embodiment, a particular nonlinear least-squares routine known as the Levenberg-Marquardt Method is used. One convenient software package that makes this method available is provided by SPSS, Inc., 444 N. Michigan Ave., Chicago, Ill. The SPSS package allows the user to enter macro commands to invoke the least-squares fit. In one embodiment, the file containing the input field data is first opened and the following commands are issued:

MODEL PROGRAM BCO#=0 BCTyp1=0

BAGECTyp1=0 BCTyp2=0 BAGECTyp2=0 BCTyp3=0 BAGECTyp3=0 BGAUGE=0 BLGTH=0

COMPUTE PRED_=exp(bCO#*CO#+bCTyp1*n_CTyp1+bageCTyp1*ageCTyp1+bCTyp2*n_CTyp2+bageCTyp2*ageCTyp2+bCTyp3*n_CTyp3+bageCTyp3*ageCTyp3)

NLR cabprob1

/OUTFILE='C:\TRIAL\TEM\SPSSFNLR.TMP'

/PRED PRED_

/CRITERIA SSCONVERGENCE 1E-8 PCON 1E-8

EXECUTE

The equations whose right-hand sides are zero represent parameter initializations. The equation corresponding to Eq. (10) appears on the COMPUTE line, which assigns PRED_. The fourth-last line directs the output to a particular file. The second-last line sets some error tolerances that dictate when the algorithm stops. This program executes a Levenberg-Marquardt fitting algorithm to estimate the parameters labeled in the COMPUTE line with a “B” as a first letter.

Although a Levenberg-Marquardt algorithm was used, it should be clear to someone of ordinary skill in the art that other regression or least-squares algorithms, whether or not these algorithms are used as part of a statistical package such as SPSS, may also be used in the present invention.
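
By way of illustration only, a fit of the same exponential model can be sketched without a statistical package. The Python code below is a simplified textbook form of Levenberg-Marquardt damping and is not the SPSS implementation; the function names, the damping schedule (a factor of ten up or down), and the stopping rules are assumptions made for the sketch. The data are synthetic, generated without noise from a known two-parameter vector so that the recovered estimates can be compared against it, and all parameters start at zero as in the initializations above.

```python
import math


def predict(beta, x):
    """Model value exp(beta . x) for one connection."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))


def sse(beta, xs, ys):
    """Sum of squared errors between observations and predictions."""
    return sum((y - predict(beta, x)) ** 2 for x, y in zip(xs, ys))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_lm(xs, ys, beta, lam=1e-3, tol=1e-8, max_iter=200):
    """Damped least-squares fit of y ~ exp(beta . x)."""
    k = len(beta)
    current = sse(beta, xs, ys)
    for _ in range(max_iter):
        jtj = [[0.0] * k for _ in range(k)]
        jtr = [0.0] * k
        for x, y in zip(xs, ys):
            p = predict(beta, x)
            grad = [p * v for v in x]  # derivative of exp(beta . x) w.r.t. beta
            for i in range(k):
                jtr[i] += grad[i] * (y - p)
                for j in range(k):
                    jtj[i][j] += grad[i] * grad[j]
        # Marquardt damping: inflate the diagonal of J^T J by (1 + lam).
        damped = [[jtj[i][j] + (lam * jtj[i][i] if i == j else 0.0)
                   for j in range(k)] for i in range(k)]
        step = solve(damped, jtr)
        trial = [b + s for b, s in zip(beta, step)]
        try:
            trial_sse = sse(trial, xs, ys)
        except OverflowError:
            trial_sse = math.inf  # treat an overflowing step as rejected
        if trial_sse < current:
            gain = current - trial_sse
            beta, current, lam = trial, trial_sse, lam / 10
            if gain <= tol * current:
                break
        else:
            lam *= 10
            if lam > 1e12:
                break
    return beta, current


# Synthetic, noise-free data from a known parameter vector (illustrative).
true_beta = [-1.0, 0.2]
xs = [[1, 0], [1, 1], [1, 2], [1, 3], [0, 1], [0, 2]]
ys = [predict(true_beta, x) for x in xs]
beta_hat, final_sse = fit_lm(xs, ys, [0.0, 0.0])
```

With observed 0/1 trouble indicators in place of the synthetic `ys`, the residuals would not vanish at the optimum, and the fit would stop on the relative-improvement test rather than on the damping bound.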

While the invention has been disclosed in connection with the preferred embodiments shown and described in detail, various modifications and improvements thereon will become readily apparent to those skilled in the art. Accordingly, the spirit and scope of the present invention are to be limited only by the following claims. 

What is claimed is:
 1. A method of predicting and diagnosing trouble in a connection having a plurality of components with a plurality of characteristics, the method comprising: selecting at least one characteristic common to the plurality of components as a basis for predicting and diagnosing trouble; calculating a first value for the selected characteristic common to the plurality of components; based on a performance history of the connection, providing a second value representing one of prior trouble or no prior trouble with the connection; and, determining a probability of trouble in the connection at a given time as a function of the first value and the second value, wherein determining includes expressing the second value as a function of the first value and a third value, and, approximating the third value to permit the function of the first value and the third value to approach the second value, wherein the third value represents the probability of trouble in the connection at a given time.
 2. The method of claim 1, wherein the connection comprises a telephone connection.
 3. The method of claim 1, wherein at least one component comprises a cable segment.
 4. The method of claim 3, wherein a characteristic of a component comprises at least one of cable age, cable type, cable length, cable size, and cable gauge.
 5. The method of claim 1, wherein determining comprises performing a regression calculation.
 6. A computer program product disposed on a computer readable medium for predicting and diagnosing trouble in a connection having a plurality of components with a plurality of characteristics, the computer program product comprising instructions for causing a processor to: use at least one characteristic common to the plurality of components as a basis for predicting and diagnosing trouble; calculate a first value for the used characteristic common to the plurality of components; receive a second value representing one of prior trouble or no prior trouble with the connection, the second value based on a performance history of the connection; and, determine a probability of trouble in the connection at a given time as a function of the first value and the second value, wherein the instructions for causing the processor to determine include instructions for causing the processor to express the second value as a function of the first value and a third value, and, approximate the third value to permit the function of the first value and the third value to approach the second value, wherein the third value represents the probability of trouble in the connection at a given time.
 7. The computer program product of claim 6, wherein the connection comprises a telephone connection.
 8. The computer program product of claim 6, wherein at least one component comprises a cable segment.
 9. The computer program product of claim 8, wherein a characteristic of a component comprises at least one of cable age, cable type, cable length, cable size, and cable gauge.
 10. The computer program product of claim 6, wherein the instructions for causing the processor to determine comprise instructions for causing the processor to perform a regression calculation. 